Dynasor: A Dynamic Memory Layout for Accelerating Sparse MTTKRP for Tensor Decomposition on Multi-core CPU
Sparse Matricized Tensor Times Khatri-Rao Product (spMTTKRP) is the most
time-consuming compute kernel in sparse tensor decomposition. In this paper, we
introduce a novel algorithm to minimize the execution time of spMTTKRP across
all modes of an input tensor on multi-core CPU platforms. The proposed algorithm
leverages the FLYCOO tensor format to exploit data locality in external memory
accesses. It effectively utilizes computational resources by enabling lock-free
concurrent processing of independent partitions of the input tensor. The
proposed partitioning ensures load balancing among CPU threads. Our dynamic
tensor remapping technique leads to reduced communication overhead along all
the modes. On widely used real-world tensors, our work achieves a 2.12x to 9.01x
speedup in total execution time across all modes compared with
state-of-the-art CPU implementations.
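
For readers unfamiliar with the kernel, the following is a minimal sketch of what spMTTKRP computes for mode 0 of a 3-mode tensor. It assumes a plain coordinate (COO) list of nonzeros and is not the FLYCOO format, the partitioning scheme, or the remapping technique described in the abstract. The function name `sp_mttkrp_mode0` and the toy data are hypothetical and chosen only for illustration.

```python
def sp_mttkrp_mode0(coords, values, B, C, num_rows):
    """Mode-0 spMTTKRP for a 3-mode sparse tensor stored as COO (illustrative only).

    coords:   list of (i, j, k) index tuples, one per nonzero
    values:   list of nonzero values, aligned with coords
    B, C:     factor matrices for modes 1 and 2, given as lists of rank-R rows
    num_rows: size of mode 0, i.e. the number of output rows
    """
    rank = len(B[0])
    out = [[0.0] * rank for _ in range(num_rows)]
    for (i, j, k), v in zip(coords, values):
        row_b, row_c = B[j], C[k]
        # Each nonzero adds v * (B[j] elementwise-times C[k]) to output row i.
        for r in range(rank):
            out[i][r] += v * row_b[r] * row_c[r]
    return out


# Toy 2 x 2 x 2 tensor with three nonzeros and rank-2 factor matrices.
coords = [(0, 0, 0), (0, 1, 1), (1, 0, 1)]
values = [1.0, 2.0, 3.0]
B = [[1.0, 2.0], [3.0, 4.0]]
C = [[5.0, 6.0], [7.0, 8.0]]
result = sp_mttkrp_mode0(coords, values, B, C, num_rows=2)
print(result)  # [[47.0, 76.0], [21.0, 48.0]]
```

In this serial form, several nonzeros may update the same output row, which is why a parallel version needs either synchronization or a partitioning in which threads write to disjoint rows. The abstract's lock-free processing of independent partitions addresses that concern.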